Clinical Acronym/Abbreviation Normalization using a Hybrid Approach
Authors
Abstract
A unique characteristic of clinical text is the pervasive use of acronyms and abbreviations, which are often ambiguous. In 2013, the ShARe/CLEF eHealth Evaluation Lab organized three shared tasks on clinical natural language processing (NLP) and information retrieval (IR), one of which was to normalize acronyms and abbreviations to UMLS concept unique identifiers (CUIs). This paper describes a hybrid system that combines different word sense disambiguation (WSD) methods with existing knowledge bases to normalize and encode clinical abbreviations. Our system achieved an accuracy of 0.719 on the independent test set, ranking first in the challenge.
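The abstract does not spell out how the WSD methods and knowledge bases are combined, so the following is only a minimal sketch of one plausible hybrid scheme, not the authors' system. It assumes a hypothetical sense inventory (`SENSE_INVENTORY`) that maps each abbreviation to candidate concepts with context cue words, and a hypothetical majority-sense table (`MAJORITY_SENSE`) used as a fallback. The concept identifiers are placeholders, not real UMLS CUIs.

```python
import re

# Hypothetical sense inventory: abbreviation -> {placeholder concept ID: cue words}.
# The IDs below are illustrative placeholders, NOT real UMLS CUIs.
SENSE_INVENTORY = {
    "mi": {
        "CUI_PLACEHOLDER_MYOCARDIAL_INFARCTION": {"chest", "troponin", "ecg", "cardiac"},
        "CUI_PLACEHOLDER_MITRAL_INSUFFICIENCY": {"valve", "murmur", "regurgitation", "mitral"},
    },
    "cxr": {
        "CUI_PLACEHOLDER_CHEST_XRAY": set(),
    },
}

# Hypothetical fallback: the sense assumed to be most frequent in training data.
MAJORITY_SENSE = {"mi": "CUI_PLACEHOLDER_MYOCARDIAL_INFARCTION"}


def normalize(abbrev, context):
    """Return a concept ID for `abbrev`, or None if it is not in the inventory."""
    key = abbrev.lower()
    senses = SENSE_INVENTORY.get(key)
    if not senses:
        return None  # unknown abbreviation: no knowledge-base entry

    # Knowledge-base step: an unambiguous abbreviation needs no disambiguation.
    if len(senses) == 1:
        return next(iter(senses))

    # WSD step: score each candidate by cue-word overlap with the context.
    tokens = set(re.findall(r"[a-z]+", context.lower()))
    scores = {cui: len(cues & tokens) for cui, cues in senses.items()}
    best = max(scores, key=scores.get)

    # Fallback step: with no contextual evidence, use the majority sense.
    if scores[best] == 0:
        return MAJORITY_SENSE.get(key, best)
    return best
```

For example, `normalize("MI", "systolic murmur over the mitral valve")` selects the mitral-insufficiency placeholder, while a context with no cue words falls back to the majority sense. A real system would likely replace the overlap score with a trained classifier and draw its inventory from a resource such as the UMLS.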
Similar resources
Semi-Supervised Maximum Entropy Based Approach to Acronym and Abbreviation Normalization in Medical Texts
Text normalization is an important aspect of successful information retrieval from medical documents such as clinical notes, radiology reports and discharge summaries. In the medical domain, a significant part of the general problem of text normalization is abbreviation and acronym disambiguation. Numerous abbreviations are used routinely throughout such texts and knowing their meaning is criti...
Task 2: ShARe/CLEF eHealth Evaluation Lab 2013
In this pilot study, we aimed to generate a reference standard of clinical acronyms and abbreviations normalized to concepts from a standardized, medical vocabulary for the ShARe/CLEF eHealth 2013 challenge. In this paper, we review prior text normalization shared tasks, reference standard generation approaches, and recent clinical acronym and abbreviation normalization research. We report inte...
A Hybrid Approach Based on Higher Order Spectra for Clinical Recognition of Seizure and Epilepsy Using Brain Activity
Introduction: This paper proposes a reliable and efficient technique to recognize different epilepsy states, including healthy, interictal, and ictal states, using Electroencephalogram (EEG) signals. Methods: The proposed approach consists of pre-processing, feature extraction by higher order spectra, feature normalization, feature selection by genetic algorithm and ranking method, and classif...
TweetNorm: Text Normalization on Italian Twitter Data
This paper addresses the issue of text normalization on non-standard Italian data. We present TweetNorm, a system which normalizes Italian tweets in a way that the amount of microblog slang and distorted text appearance is drastically reduced and the normalized output has a much cleaner and more formal style. The paper shows that with a set of fixed language-independent rules and trained rules...
Normalization of Abbreviations/Acronyms: THCIB at CLEF eHealth 2013 Task 2
This paper describes the THCIB systems that were used in the ShARe/CLEF eHealth Lab 2013 task 2. We built a baseline system using open source software, and improved the performance by adding dictionaries. The dictionary is built from the training set and web resources using existing technologies. The experimental results show that adding a dictionary of acronyms/abbreviations can improve the performance s...